4:30 pm
>> so many places to go and so many ways to get there. sometimes there are too many choices. how do we count all those things and en organize them so that we can make sense of it all?
4:31 pm
by looking to the mathematical world of combinatorics. meet charlie, the regional claims adjuster for an insurance company. on the road as usual, visiting clients, solving problems, and trying to get to five different job sites as quickly and efficiently as possible so he can get home to his daughter sarah's birthday party. his problem: finding the best way to organize his trip, taking into consideration the distance between each client, possible time spent with each client, and the most efficient route. sometimes the number of possibilities seems overwhelming, even when we're faced with only a few choices, but we have to make decisions anyway. did he make the right choice? let's count the ways. counting things seems so simple. children do it intuitively, connecting a thing with fingers to say how many. it's the way we organize the stuff of our lives, from bugs to errands. and finding efficient and interesting ways to organize
4:32 pm
things and information is what the field of mathematics known as combinatorics is all about. >> children: [ in unison ] as i was going to st. ives, i met a man with seven wives. each wife had seven sacks -- >> hey, hey. hold on, kids. going somewhere are you? >> all: yeah. >> all right, well, make sure you return, okay? >> all: okay. as i was going to st. ives, i met a man with seven wives. each wife had seven sacks... >> even with the fairly simple system of colored buttons, we've already organized information in a relatively sophisticated way. and we started with a common, intuitive form of organization: the act of counting, or enumeration...enumerating, or listing, the children and then encoding that enumeration with these buttons. but there are lots of ways in which we could have assigned children to buttons, and they encode a variety of levels of
4:33 pm
information, each corresponding to counting different things about the children. abstractly, in using the buttons, i've turned the children into a mathematical object, a set. and the fact that it can be a little easier to count buttons in a jar -- or count properties of a set of objects -- than it is to count kids running around the neighborhood is just the smallest of hints at the power of combinatorics. counting things, that's what combinatorics is all about. as an individual field of study, it's a relatively new one. and these days, with its strong connection to the world of computing, by providing insights into how to best organize and understand the power of the computer, it's incredibly important for modern technology. however, as cutting edge as the subject may be, its basic concerns go back to puzzles and problems from the earliest recorded evidence of mathematical thought. now, this is a facsimile of the rhind papyrus, copied around 1850 b.c. by the scribe ahmes
4:34 pm
from the now-lost text of an earlier dynasty. it was named after a scottish antiquarian, alexander henry rhind, who bought it in 1858 in luxor, egypt. in it, he found what amounts to a math textbook full of problems and solutions that addressed everyday issues, things like how to divide ten loaves of ea equally among seven men and much of the algebra needed to build the pyramids. it also contains possibly the earliest written combinatorics problem, and it goes like this: seven houses contain seven cats. each cat kills seven mice. each mouse had eaten seven ears of grain. each ear of grain would have produced seven hekats of wheat. what is the total of all these? a fairly simple math problem... and a very pragmatic combinatorial argument for getting more cats. in fact, it's not that far removed from the st. ives riddle our children at the gate were talking about. both of these problems have elegant solutions by the recognition of the simple
4:35 pm
pattern of the powers of seven that we see. now, several centuries later, we find slightly more sophisticated kinds of problems and solutions in 6th-century b.c. india. the medical treatise sushruta samhita gives us this problem: how many combinations can be made out of six different tastes: bitter, sour, salty, sweet, astringent, and hot? by taking them one at a time, two at a time, three at a time, and so on, we learn that there are in fact 63 combinations. this was important in ayurvedic medicine because it gave ancient physicians a way to work out which medical substance would best match a patient's ailments. now, there are two things that are going on here. on the one hand, we're asking for the number of possible combinations, or subsets, of the six different spices. this number, as we see, is 63, which is 2^6 - 1. now, notice the sixth power and six spices. there's a pattern here: number
4:36 pm
goes to power. the magic is that we're seeing a formula, a quick way to answer the question of how many combinations of spices without actually listing them. now, there's a more particular problem that was of interest to the indian doctors, and that was the individual questions of how many combinations of one spice, how many of two, and so on and so on. again, we could simply solve it by listing them all, but much more interestingly, there's a pattern, or a formula. and 300 years later, indian scholars wrote the algorithm for solving this, again without counting, an algorithm for finding that pattern, and in this gave us one of the earliest written statements of combinatorial rules and not simply problems. so we jump ahead now a few centuries and find mathematician and scientist blaise pascal, who in 1655 rediscovered this combinatorial rule in what was later named pascal's triangle. this triangular array of numbers
4:37 pm
is built in a fairly simple way. we start with a 1 at the top and imagine that there's a 0 on either side of the 1. we're going to call this row 0. now we add pairs of adjacent numbers and write their sum between them on the next row: 0 + 1 = 1, 1 + 1 = 2, 1 + 0 = 1. we've completed row 1. now we go to row 2 and we do the same thing. we have a 0 and a 1, so beneath it we put a 1. now we have a 1 and a 1. in between them and beneath it we put a 2. now we have a 1 and a 0, and in between those and beneath it we put a 1. tacking zeroes onto either side, we've now built row 2. and so on and so on and so on. now lop off the zeroes on either side of the triangle. what you see there is what we've seen before, pascal's triangle. let's say it's your cousin louie's birthday and he's crazy about hats. you're considering a birthday gift of five hats, but you can
4:38 pm
only afford to buy two, and you can't decide which ones. so the problem is, how many different combinations of two hats are there, and that way you can know how many pairs you have to consider. when we're working with pascal's triangle, we can actually read this information off the triangle. so now what we need to do is go to row 5. and remember, we started numbering the rows by 0, so we actually have to count down six rows to row 5. and again, the columns also start with 0, and we go in to the second column, which actually requires 0, 1, 2, so the third entry. and there it is. the number we find is 10, and that's the right answer. and this number, the number of ways to choose two things out of five things, a mathematician calls "5 choose 2." so 5 choose 2 is equal to 10. pascal didn't invent all the ideas behind this triangle. he did what all mathematicians do, he built on knowledge from an earlier time and then pushed the boundaries of that knowledge to give us something even more interesting and, as is often the case with mathematics, more
4:39 pm
useful. combinatorial problems and puzzles test our ability to think through problems and logically come up with efficient ways to organize information. >> kits, cats, sacks, and wives, how many were going to st. ives? >> so what's the answer to this riddle? let's find somebody who might help us figure it out. so how do we make sense of the confusion caused by all these kids? well, i'm lucky to have here with me jenny quinn, who's a professor of interdisciplinary arts and sciences at the university of washington-tacoma. jenny, thanks for coming. >> thank you, dan. nice jar. >> well, here are these buttons, and as far as i'm concerned, one button per kid. and i happen to have their pictures here. >> okay. so what we have here are two sets of six things. >> six buttons, six photographs. >> so a set is just a collection of objects. what you did here was you
4:40 pm
created a bijection between the set of buttons and the set of kids, okay? >> sure. one button per child. >> one button per child. one thing from each set matched up, holding hands with something from the other set. and this gives us a lot of information, okay? >> yes. >> so it's a question of how much information we actually want to keep track of. >> i see, because we could assign the buttons in many different ways, in fact. >> yes, because notice each one of these buttons is actually distinct. so if we paid attention to which button went to which child, we would be able to say, if one of the buttons were missing, what child that corresponded to. >> right, right. >> but we might not want to keep all that information because there are 720 possibilities of ways to match up buttons to kids. so we could choose to consider smaller subsets to pay attention to. now, a subset is a collection, a smaller collection of the entire set. and in fact, we could ask that the blue buttons create a subset and they go to the blue-shirted
4:41 pm
kids and the orange buttons go to a subset which go to the orange-shirted kids. >> so it's actually a bijection within a bijection. >> exactly. the problem with these kinds of bijections is, as the number of objects you're considering grows, the number of possibilities becomes unmanageable. and we have to learn how to keep track of more information or we have to pinpoint the information that's really important to us. >> right, so in fact, there are many ways that we could have made this initial bijection between the buttons and the children, the buttons and the photographs? >> oh, absolutely. because there are six different buttons, we had a choice of six buttons to give the first person, then there would be five buttons left to give the second person, four buttons for the third, three, two, one. so the total number of ways of creating this bijection would be the product of all of those: 6 x 5 x 4 x 3 x 2 x 1, which is 6 factorial, or 720. >> and so if there had been more children, i would have needed more buttons, and this way in
4:42 pm
which you're counting seems to grow very, very quickly, right? >> yes, this is what we call a combinatorial explosion. >> combinatorial explosion. >> and it's in fact too much information for us to keep track of, so what we'd like to do instead is decide what information is most important to us. so combinatorics is going to help us organize our thoughts. >> so now these bijections that we've been constructing between the buttons and the kids is in fact just the tip of the iceberg for combinatorics, which is there are much richer bijections, bijections that encode much more subtle correspondences, isn't that true? >> oh, absolutely. in fact, what we can do now is we can return to pascal's triangle and we can see how the information about uncle louie's hats is actually encoded in the triangle. >> and in the geometry and just moving around the triangle, is that right? >> absolutely. >> all right. >> if you recall, we were choosing two hats for uncle louie out of five, right? >> that's right. >> and it corresponds to that position 5 choose 2 right there. so imagine yourself starting at
4:43 pm
the top, at the 0-0 position. now, each time you take a step, you're either going to go to the left or to the right. >> okay. >> and when you take a step to the left, that means the hat in that row is not selected. and each time you take a step to the right, it means that's one of the hats you're going to give uncle louie. so starting from the top and going to the 5-2 position, it's going to take two steps to the right and three steps to the left to end up down in the 5-2 position. >> right, equal to 5 choose 2, as we discussed in the beginning. >> so this actually tells us a little bit more about the triangle itself and the construction of the triangle. because if you think about it, suppose you wanted to pick any position in the triangle and understand where it came from. let's call it "n choose k." >> n choose k for generally the nth row and the kth entry in. >> so the way to think about it is we're trying to create subsets of size k from n elements total.
4:44 pm
and the way we can think about it coming into this triangle through these paths is we either entered this cell from a path going from the right or a path coming from the left. so you can think about this particular cell as being created as the sum of two possibilities, either the -- for the row that it came before, either the object was included or the object was excluded. so the way to include the object is n - 1 choose k - 1 ways, and the way to exclude the object is -- >> n - 1 choose k. >> exactly. >> so that actually encodes what mathematicians would call as a recurrence. >> exactly. >> so this recurrence that we derived really of n choose k is equal to (n - 1 choose k -1) + (n - 1 choose k) is actually an example of a very fundamental combinatorial principle, isn't that right? >> absolutely. and here's my favorite example of a principle, it's called the pigeonhole principle.
4:45 pm
>> ah, sounds great. >> and the reason it's my favorite is because it's very intuitive. so here's the example i'm going to show you. we have five pigeons and they are flying home. these are carrier pigeons, so they're flying to their home. there are only four homes. >> right, so somebody might get left out, is that right? >> unfortunately, if there are more objects than there are boxes and only one thing is in a box, then there's stuff that doesn't get assigned. but to make that pigeon happier, we're going to say that everybody has to be put in a box. and what that's going to do is it's going to force that pigeon into some box that's already occupied. >> right, right. >> so the pigeonhole principle in this case is going to say that if there are five pigeons in four boxes, some box -- no matter how the pigeons distribute themselves among the boxes -- some box is going to contain at least two pigeons. >> right, so some box might actually have three pigeons, right? i mean, that's also possible. >> right, but the box with three pigeons also contains two.
4:46 pm
>> that's right. >> so that's at least two. and in fact, if we take the average of the number of pigeons to the number of boxes, okay, what the pigeonhole principle says is some box contains at least the average. >> so the average number of pigeons per box here would have to -- would simply be the number of pigeons, which is five, divided by the number of boxes, which is four. >> but because we can't have a quarter of a pigeon in some box -- >> right, no pigeons are harmed during the making of this show. >> right. they're discrete objects. we have to round up. so that tells us we're going to get two pigeons per box in this case. >> right, at least two. so these basic principles that we've been outlining, these are numbers that are easy to compute. we've been finding formulas, in fact, computing the numbers of pairs of hats or computing things like the average for the pigeonhole principle. >> in particular because they're small. >> but as we get more possibilities, we begin to experience what we've been calling this combinatorial explosion, and we're going to see an example of that now as poor old charlie gets a new
4:47 pm
itinerary. so let's take a look. charlie seems to have found his solution. at least he's crossing one off, and that's a start. except now his boss has called and added more stops to his itinerary. eight more, to be exact. now he's got 12 places to organize. and it's beginning to look like there are so many possible ways to go, he'll never find an efficient route that will let him get home in time. poor charlie, he really seems to have his hands full here. he's suffering from what we were calling before combinatorial explosion. >> well, poor charlie doesn't have a prayer in the world of answering his question definitively, okay, but there are ways that we can cleverly approach a problem and organize the information so that we can reduce the amount of work that needs to be done. >> oh, is that right? >> here's an example for you. the setup is we have a keypad with ten digits and we want to input a three-digit code to
4:48 pm
unlock the door. >> okay. >> so if you were to do this in sort of a very natural, methodical method, you'd put in 0-0-0, you'd put in 0-0-1. >> make your way through. >> and you test every number from 0 to 9-9-9, okay, and that might take you a while. >> and that's 1,000 codes, and i have to hit -- for each code, i hit it three times, so that's 3,000 pushes, right? >> exactly. okay, now, if you use something called a de bruijn sequence, okay, this is a way of organizing this information so we really reduce the amount of work that we're going to do. we're going to cut the amount of work by a factor of 10. in a de bruijn sequence, we can organize the list of numbers in such a way that every three digits that we want to read is a different set, a different combination. so you might start with 0-0-0, test those first three, and then hit 1, that's going to be 0-0-1 is now going to be tested. >> i see, so i'm sort of sliding a window along.
4:49 pm
first it's 0-0-0, and then i would slide it one thing. in other words, i'd press a 1, so that would mean 0-0-1 is the next code that i was testing. >> exactly. and now by organizing it in this fashion, we've reduced the amount of work by a factor of 10. so if it took five hours to test all 1,000 numbers, it's now going to take on the order of 30 minutes. >> great, math to the rescue. >> so in fact, de bruijn sequences are not only useful to get into a door, it's also being used right now in genomic analysis. >> yes, genomics generally is a big user of combinatorial work, the sort of discrete units of the genome, and we have an example of that here. let's watch. >> i'm terry gaasterland, and i'm a professor of computational biology at university of california-san diego. i am in the marine biology department, and i have collaborators in new york and i have collaborators here in oregon. you can think of dna in many
4:50 pm
different ways. if you're a computer scientist, you can think of it as ones and zeroes or you can think of it as four letters: a, c, g, and t. if you're a biologist, you think of dna as a long, complex molecule that is made of each of the compounds that a, c, g, and t represent. a genome is the specific bundle of dna that's in a particular cell. so we can think of dna as being paragraphs, words, sentences, where a genome would be the specific book that is you. dna sequencing is the process of trying to read the letters that are in the book. so sequencing is the process of fragmenting that dna into little tiny pieces and reading a little bit of each of the fragments. and now you've got all these 500-, 700-letter words and you need to fit them all back together. early on, when people started
4:51 pm
trying to sequence genomes, people came up with the most interesting and innovative ways to do it. then there was the process of directed primer walking. so randomly go into the genome, do a read, and then design a little probe that will put you at the end of that read, and then keep reading. it turned out that the problem there is that genomes are very repetitive, so every one of these ran into combinatorial problems. and it turned out that the idea craig venter was really pushing finessed all of this combinatorics. take the whole entire genome, chop it up into lots and lots and lots of little pieces, and take reads off every one of those, and do so many that you have enough information to fit those pieces back together. for any given book, if you will,
4:52 pm
you have multiple copies of the book, and one you're ripping in half at chapter one and the other you're ripping in half at chapter four. if you do enough of them, it will cohere into one. given this fragment and given this fragment, 200 bases overlap, so we'll put them together and see if we can put the next one in. and if those two overlap and then you have a third one that comes in and it matches, you know you have it right. with a human genome, which is 3 billion a's, c's, g's and t's, you need to be a little smarter. so now you set all sorts of heuristics. so you say, "well, let's take every pair of reads that overlaps by at least, say, 100 letters," and they have the identical 100 letters. the problem is there's always sequencing error. for every one of those comparisons, you're computing all these details about it. so that is the problem that the computer has to solve as it's doing assembly of all of these pieces.
4:53 pm
when i first started in looking at genomes, the only genomes that were being sequenced were one-celled microbes, bacteria and archaea. proteins from one organism would have absolutely no match whatsoever in the next organism over, so there seemed to be proteins that were absolutely unique to each of these organisms. so that made people step back and say, "well, we need to sequence more organisms and fill in all the holes and the gaps in this tree of life." so now we go from one-celled organisms to more complicated organisms with several or many cells or even with differentiated cells like a sea urchin or a human. gene by gene, we can look across the evolutionary tree and ask which gene in sea urchin goes with which gene in zebrafish, and that goes with which gene in human? and by studying when that gene
4:54 pm
turns on and turns off during development or in maintenance of a healthy organism -- in sea urchin -- that tells us something about how it might be working in human. >> wow, so genomics, this is a particular example where combinatorics has really helped us. it's allowed us to simplify a problem and it's really given us a purchase on how to do things. but there are actually problems which are combinatorial in nature which we still don't know how to solve, isn't that right? >> well, like charlie, for instance. >> poor charlie, yep. >> he's in what we call a traveling salesman problem. >> yes, he is. >> the idea being you're given a certain number of cities and you want to travel -- starting from one city, you want to travel to all the cities, getting back to where you started in the most economical fashion. now, if there are a small number of cities, the analysis is not difficult. so if charlie only had three cities, okay, there are only two possibilities: either he goes clockwise from where he is or he
4:55 pm
goes counterclockwise. >> so each route has a cost as well. >> each route has a cost. but now if we add a city to that, okay, his two choices... >> oh, i see, it's starting to grow. >> go to six, okay? if we add another city... >> okay, it's getting worse. >> it goes from 6 to 24. >> i'm seeing a pattern here. is there one, in fact? >> oh, there absolutely is. if we jump to 10 cities, there are now over 350,000 possible organizations to his route. >> and that factorial number is coming in here again, isn't it? >> exactly, so 10 cities gives us 9-factorial possible routes. >> okay, 9 x 8 x 7 and so on. okay. >> if we jump to 25 cities... >> okay, that really looks like a problem. >> there are billions and billions of routes. and no matter how clever we are, there's no way we can look at every single one and find the best or fastest possible route. >> all right, well, jenny, this was great, thanks. >> my pleasure. >> so charlie is left with too
4:56 pm
many possible solutions, more than he can calculate with a simple map and a pad of paper, or even his laptop. but sometimes the path is clear, and sometimes the right way to organize information is the way that leads us home. >> hi, princess. >> daddy! >> ♪ happy birthday to you >> kits, cats, sacks, and wives, how many were going to st. ives? yay! >> our lives are constructed o of seemingly endless possibilities. we make choices so we can be more efficient in what we do and where we go, so we can unlock the secrets of the human genome, or simply solve a math riddle. the answer to the riddle, by the way, is one. i'm the only one going to st. ives. the other combinations we'll save for another time. captions by lns captioning portland, oregon www.lnscaptioning.com
4:57 pm
4:58 pm
4:59 pm
disc Borrow a DVD of this show
In Collection